Reproducible Computational Environments Using Containers: Introduction to Docker and Singularity

Times:

Tuesday 16 April 2024 10:00 - 15:30
Wednesday 17 April 2024 10:00 - 14:30

Location:

Room LG14
Murray Learning Centre
Edgbaston Campus
Birmingham
B15 2FG

Instructor: Juan Herrera (EPCC, University of Edinburgh)

This session introduces Docker and Singularity containers, with the goal of using them to create reproducible computational environments. Such environments help to ensure that research outputs can be reproduced, for example.

After completing this session you should:

Have an understanding of what containers are, why they are useful and the common terminology used

Have a working Docker installation on your local system to allow you to use containers

Understand how to use existing Docker and Singularity containers for common tasks

Be able to build your own Docker containers by understanding both the role of a Dockerfile in building containers, and the syntax used in Dockerfiles

Understand the key differences between Docker and Singularity containers to allow you to use them effectively

Understand how to manage Docker containers on your local system and Singularity containers on a remote HPC system

Appreciate issues around reproducibility in software, understand how containers can address some of these issues and what the limits to reproducibility using containers are

The practical work in this lesson will use Docker on your own laptop and Singularity on a remote HPC platform. Beyond your laptop, software container technologies such as Docker can also be used in the cloud and on high performance computing (HPC) systems. Some of the material in this lesson will be applicable to those environments too.
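
As a quick check before the session, something along the following lines can show whether the two tools are available in your shell. This is only a minimal sketch: `check_tool` is a hypothetical helper name introduced here for illustration, and it tests only whether a command is on your PATH, not whether the Docker service is actually running.

```shell
# Report whether a command is available on the PATH.
# check_tool is a hypothetical helper, not part of Docker or Singularity.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

check_tool docker        # expected on your own laptop
check_tool singularity   # typically only present on the remote HPC system
```

If Docker is reported as found, running `docker --version` should print the installed version. A "missing" result for Singularity on your laptop is not a problem, since the Singularity practical work takes place on the remote HPC platform.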

Prerequisites

You should have basic familiarity with using a command shell, and the lesson text will at times request that you “open a shell window”, with an assumption that you know what this means.

Under Linux or macOS it is assumed that you will access a bash shell (usually the default), using your Terminal application.

Under Windows, PowerShell and Git Bash should allow you to use the Unix instructions. We will also try to give command variants for Windows cmd.exe.

The lessons will sometimes ask you to use a text editor to create or edit files in particular directories. It is assumed that you either have an editor you know how to use that runs within the working directory of your shell window (e.g. nano), or, if you use a graphical editor, that you can use it to read and write files in the working directory of your shell.

A note about Docker and Singularity

Docker and Singularity are mature, robust and very widely used applications. Nonetheless, they are still under extensive development. New versions are released regularly, often containing a range of updates and new features.

While we do our best to ensure that this lesson remains up to date and the descriptions and outputs shown match what you will see, inconsistencies can occur.

If you spot inconsistencies or encounter any problems, please do report them by opening an issue in the GitHub repository for this lesson.

Schedule

		Setup	Download files required for the lesson
Day 1	10:00	1. Introducing Containers	What are containers, and why might they be useful to me?
	10:20	2. Introducing the Docker Command Line	How do I know Docker is installed and running? How do I interact with Docker?
	10:30	3. Break	Break
	11:00	4. Exploring and Running Containers	How do I interact with Docker containers and container images on my computer?
	11:30	5. Cleaning Up Containers	How do I interact with a Docker container on my computer? How do I manage my containers and container images?
	11:40	6. Finding Containers on Docker Hub	What is the Docker Hub, and why is it useful?
	12:00	7. Break	Break
	13:00	8. Creating Your Own Container Images	How can I make my own Docker container images? How do I document the ‘recipe’ for a Docker container image?
	13:35	9. Creating More Complex Container Images	How can I make more complex container images?
	14:35	10. Break	Break
	15:05	11. Examples of Using Container Images in Practice	How can I use Docker for my own work?
	15:25	Finish
Day 2	10:00	12. Singularity: Getting started	What is Singularity and why might I want to use it?
	10:25	13. Using Singularity containers to run commands	How do I run different commands within a container? How do I access an interactive shell within a container?
	10:40	14. Break	Break
	11:10	15. Using Docker images with Singularity	How do I use Docker images with Singularity?
	11:25	16. The Singularity cache	Why does Singularity use a local cache? Where does Singularity store images?
	11:35	17. Files in Singularity containers	How do I make data available in a Singularity container? What data is made available by default in a Singularity container?
	11:55	18. Break	Break
	12:55	19. Using Singularity to run BLAST+	How can I use Singularity to run bioinformatics workflows with BLAST+?
	13:55	20. Containers in Research Workflows: Reproducibility and Granularity	How can I use container images to make my research more reproducible? How do I incorporate containers into my research workflow?
	14:20	21. Break	Break
	14:50	22. (Optional) Running MPI parallel jobs using Singularity containers	How do I set up and run an MPI job from a Singularity container?
	16:00	23. (Optional) Additional topics and next steps	How can I learn more about how containers work? What different container technologies are there, and what are the differences and implications? How can I orchestrate different containers?
	16:00	Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.